Abstract

The statistical properties of protein folding within the $\phi^4$ model are
investigated. The calculation is performed using statistical mechanics and
the path integral method. In particular, the evolution of the heat capacity
as a function of temperature is given for various levels of the nonlinearity
of the source and of the strength of the interaction between the protein
backbone and the nonlinear source. It is found that the nonlinear source
contributes constructively to the specific heat, especially at higher
temperature, when it interacts weakly with the protein backbone. This
indicates increasing energy absorption as the intensity of the nonlinear
sources grows. The simulation of protein folding dynamics within the model
is also refined.
1. Introduction

It is well known that protein folding proceeds in a time-ordered manner from
the primary to the secondary and subsequent structures. Furthermore, the
secondary structure consists of the shape representing each segment of a
polypeptide tied by hydrogen bonds, van der Waals forces, electrostatic
interaction and hydrophobic effects. It is also formed around a group of
amino acids considered as the ground state, and extended to include adjacent
amino acids until the blocking amino acids are reached and the whole protein
chain along the polypeptide has adopted its preferred secondary structure.
However, this mechanism is not yet understood at a satisfactory level. For
instance, studies based on statistical analysis that identify the
probabilities of locating amino acids in each secondary structure still
achieve less than 75% accuracy. Moreover, the main mechanism responsible for
a structured folding pathway has not yet been identified at all, even though
protein misfolding is believed to be the main cause of several diseases such
as cancers [1].

On the other hand, some previous studies have shown that nonlinear
excitations could play an important role in the conformational dynamics of
the protein backbone by decreasing the effective bending rigidity of a
biopolymer chain, leading to a buckling instability of the chain [2]. The
results motivate us to develop a model describing the conformational changes
of proteins based on the $\phi^4$ theory [3]. The model is inspired by some
previous models which attempt to reproduce protein folding using a nonlinear
Schrödinger hamiltonian with an additional tension-like force [4, 5]. Those
models explain the transition of a protein from a metastable to its ground
conformation as induced by solitons, where the mediator of the protein
transition is the Davydov soliton propagating through the protein backbone
[6]. It has been shown that our model could reproduce and improve such models
more naturally from first principles using the lagrangian formalism. Another
known theoretical approach to the conformational dynamics of biomolecules is
the so-called ab initio quantum chemistry approach which, however, requires
astronomical computational power to deal with realistic biological systems
[7, 8]. In contrast, along with the current model there are also some
attempts to describe the dynamics in terms of elementary biomatter using the
field theory approach [9] and open quantum systems [10].

This paper follows the same model in [3] to describe the protein folding
dynamics. In contrast with the previous works adopting the nonlinear
Schrödinger equation and putting in the required interactions by hand, e.g.
[4, 5, 6], the $\phi^4$ term in the present model produces a nonlinear
Klein-Gordon equation as a source of disturbance, that is, the $\phi^4$
self-interaction generates the nonlinear and tension force terms naturally
[11]. In the model, the protein backbone is assumed to be linear in the
initial condition. Then a nonlinear bunch of light, like a laser, is passed
through the backbone. The interaction between the conformational changes and
the nonlinear source leads to a certain U(1) symmetry breaking. This would be
the main source of protein folding. However, rather than investigating the
dynamics as done in the previous works [3, 5], this paper deals with the
statistical properties involved in the process. In particular, the focus is
put on the heat capacity at constant volume, $C_V$, representing the energy
absorption against temperature changes. The effect of the nonlinear sources
on $C_V$ is investigated.

The paper is organized as follows. First, the model and the underlying
assumptions are briefly reviewed in Sec. 2. This is followed in Sec. 3 by a
short derivation of the relevant equations of motion (EOMs) as done in [3],
together with the refined numerical simulation of the folding process within
the model. In Sec. 4 the statistical mechanics properties are investigated
in detail. Finally the paper is concluded with a summary and discussion.



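2. The Model

Let us briefly review the model proposed in our previous work [3]. The
lagrangian density in the model is given as,

$$\mathcal{L}_{\rm tot} = \mathcal{L}_c(\phi) + \mathcal{L}_s(\psi) + \mathcal{L}_{\rm int}(\phi,\psi), \tag{1}$$

where,

$$\mathcal{L}_c = (\partial_\mu\phi)^\dagger(\partial^\mu\phi) + m_\phi^2\,(\phi^\dagger\phi), \tag{2}$$
$$\mathcal{L}_s = (\partial_\mu\psi)^\dagger(\partial^\mu\psi) + \frac{\lambda}{4}(\psi^\dagger\psi)^2, \tag{3}$$
$$\mathcal{L}_{\rm int} = -\Lambda\,(\phi^\dagger\phi)(\psi^\dagger\psi), \tag{4}$$

representing the conformational changes of a protein backbone and the
nonlinear source injected into the backbone, while the last one is the
interaction term between both.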

From the lagrangian, the total potential working in the system can be
written as,

$$V_{\rm tot}(\psi) = -\frac{\lambda}{4}(\psi^\dagger\psi)^2 + \Lambda\,(\phi^\dagger\phi)(\psi^\dagger\psi). \tag{5}$$

Imposing a local U(1) symmetry on the total lagrangian and considering its
minima lead to the vacuum expectation value (VEV),

$$\langle\psi\rangle = \sqrt{\frac{2\Lambda}{\lambda}}\,\langle\phi\rangle. \tag{6}$$

This non-zero VEV then yields the so-called spontaneous symmetry breaking.
On the other hand, substituting $\langle\psi\rangle$ into the lagrangian
induces the 'tension force' which plays an important role in enabling folded
pathways to appear naturally. The symmetry breaking at the same time shifts
the mass as follows,

$$m_\phi^2 \to \bar m_\phi^2 \equiv m_\phi^2 - \frac{2\Lambda^2}{\lambda}\langle\phi\rangle^2. \tag{7}$$

Roughly, $\langle\phi\rangle$ and $m_\phi$ are of the same order. Then one
can obtain a constraint for the couplings as follows,

$$1 - \frac{2\Lambda^2}{\lambda} > 0 \quad\text{or}\quad 2\Lambda^2 < \lambda, \tag{8}$$

to guarantee the positive masses.
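
The positivity condition can be checked numerically before running a
simulation. The following is a minimal sketch, assuming the mass shift reads
m_phi^2 - (2 Lambda^2 / lambda) <phi>^2 and the constraint reads
2 Lambda^2 < lambda. The function names are hypothetical, and the assignment
of the two numerical values to lambda and Lambda is an assumption made only
for illustration.

```python
def shifted_mass_squared(m_phi: float, vev_phi: float, lam: float, Lam: float) -> float:
    """Shifted mass squared, m_phi^2 - (2 Lam^2 / lam) * <phi>^2."""
    return m_phi**2 - (2.0 * Lam**2 / lam) * vev_phi**2


def satisfies_constraint(lam: float, Lam: float) -> bool:
    """Positivity condition 2 Lam^2 < lam, valid when <phi> is of order m_phi."""
    return 2.0 * Lam**2 < lam


# Illustrative couplings; which value belongs to which coupling is assumed here.
lam, Lam = 0.0028, 0.0003
m_phi = 0.08  # eV

print(satisfies_constraint(lam, Lam))
print(shifted_mass_squared(m_phi, m_phi, lam, Lam) > 0.0)
```

With these magnitudes the condition also holds if the two values are
swapped, since both 2 x 0.0003^2 and 2 x 0.0028^2 are far below the other
coupling.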

Now we are ready to move further on investigating the dynamics and 
statistical properties of protein folding within the model. 



3. Dynamics of EOMs 

Having the total lagrangian in Eq. (1) at hand, one can immediately derive
the respective EOMs using the Euler-Lagrange equation in terms of $\psi$ and
$\phi$.

Note that from now on the units are restored so that the light velocity
($c$) and $\hbar$ appear explicitly in the equations.

Since the EOMs under consideration are coupled nonlinear partial
differential equations, one should in principle solve them numerically. The
numerical solution is obtained using the forward finite difference method
[12]. Both coupled EOMs in Eqs. (9) and (10) are rewritten in explicit
discrete forms as follows,

$$w_{i,j+1} = 2w_{i,j} - w_{i,j-1} + c^2\epsilon^2\left[\frac{w_{i+1,j} - 2w_{i,j} + w_{i-1,j}}{\delta^2} - 2\Lambda u_{i,j}^2 w_{i,j} - \frac{m_\phi^2 c^2}{\hbar^2} w_{i,j}\right], \tag{12}$$

for $i = 2, 3, \ldots, N-1$ and $j = 2, 3, \ldots, M-1$. In the finite
difference scheme, it is more convenient to replace $\psi$ and $\phi$ with
$u$ and $w$ respectively. The following boundary conditions for both fields
must be imposed,



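$$\begin{aligned}
\psi(0,t) &= \psi(L,t) = 0 \quad\text{and}\quad \phi(0,t) = \phi(L,t) = 0 && \text{for } 0 \le t \le b, \\
\psi(x,0) &= f(x) \quad\text{and}\quad \phi(x,0) = p(x) && \text{for } 0 \le x \le L, \\
\frac{\partial\psi(x,0)}{\partial t} &= g(x) \quad\text{and}\quad \frac{\partial\phi(x,0)}{\partial t} = q(x) && \text{for } 0 < x < L,
\end{aligned} \tag{13}$$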



where $f(x)$, $p(x)$, $g(x)$ and $q(x)$ are newly introduced auxiliary
functions. The discretized region between these boundary conditions consists
of $(N-1)\times(M-1)$ rectangles with side lengths $\Delta x = \delta$ and
$\Delta t = \epsilon$, where the side lengths must be very small to reduce
the truncation error.

In order to calculate Eqs. (11) and (12) across the whole region, the two
lowest initial rows must be given. The values at $t_1$ are fixed by the
boundary conditions in Eq. (13). The second order Taylor expansion can then
be used to determine the values in the second row. Therefore, the values at
$t_2$ are determined by,

= /. - e^. + ^ + ^ _ 2Ap?^. + , (14) 

Wi,2 = Pi - egi + I 2A/. v^ - jf^^.^Vi 1 , (15) 

for $i = 2, 3, \ldots, N-1$. Initially, let us assume that the nonlinear source has
a particular form of /(x) = 2sech(2x) e*^^ and g{x) = 1 to generate the a- 
helix, while g(x) = for the sake of simplicity. Then, one can obtain the next 
lowest initial values in this case using Eqs. (14) and (15). The subsequent
values are generated by substituting the preceding values into Eqs. (11) and
(12). The higher order values can be obtained using the iterative procedure.
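
The stepping procedure described above can be sketched for a single real
field. This is a minimal illustration and not the coupled scheme of the
paper: it uses the standard explicit central-difference update for
u_tt = c^2 u_xx + force(u), fills the second row from a second order Taylor
expansion, and then iterates the three-level recurrence. The function
`evolve`, its arguments and the callable `force` are hypothetical names, and
the stability check applies to the linear part only.

```python
import numpy as np


def evolve(f, g, force, c=1.0, length=1.0, t_end=1.0, n=101, m=401):
    """Explicit central-difference solver for u_tt = c^2 u_xx + force(u).

    f and g give u(x, 0) and u_t(x, 0); u is held at zero at both ends.
    Returns the grid x and the array u[j, i] (time index j, space index i).
    """
    x = np.linspace(0.0, length, n)
    dx = x[1] - x[0]
    dt = t_end / (m - 1)
    r2 = (c * dt / dx) ** 2
    if r2 > 1.0:
        raise ValueError("unstable step sizes: need c*dt/dx <= 1")

    u = np.zeros((m, n))
    u[0] = f(x)
    u[0, [0, -1]] = 0.0  # enforce the fixed ends on the first row

    def second_difference(row):
        return row[2:] - 2.0 * row[1:-1] + row[:-2]

    # Second row from a second order Taylor expansion in time.
    u[1, 1:-1] = (u[0, 1:-1] + dt * g(x[1:-1])
                  + 0.5 * r2 * second_difference(u[0])
                  + 0.5 * dt**2 * force(u[0, 1:-1]))

    # Remaining rows from the three-level recurrence.
    for j in range(1, m - 1):
        u[j + 1, 1:-1] = (2.0 * u[j, 1:-1] - u[j - 1, 1:-1]
                          + r2 * second_difference(u[j])
                          + dt**2 * force(u[j, 1:-1]))
    return x, u


# Linear check: u(x, t) = sin(pi x) cos(pi t) solves the force-free problem.
x, u = evolve(f=lambda x: np.sin(np.pi * x),
              g=lambda x: np.zeros_like(x),
              force=lambda u: np.zeros_like(u))
print(np.max(np.abs(u[-1] + np.sin(np.pi * x))))  # small residual at t = 1
```

A nonlinear term can be supplied through `force`; coupling a second field
would require stepping both arrays inside the same loop.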
In this paper, the simulation is done using the following values for the
relevant parameters (in natural units) : m = 0.08 eV, L = 2.364 nm, A = 
0.0028, A = 0.0003. It should be emphasized that these values satisfy the
constraint in Eq. (8). The result is given in Fig. 1. This result is also
a revised version of the previous one reported in [3], which contained some
technical errors, although the conclusion remains the same.

The left figure in each box describes the propagation of the nonlinear
sources in the protein backbone, while the right one shows how the protein
is folded. As can be seen in the figure, the protein backbone is initially
linear before the nonlinear source injection. As the soliton starts
propagating over the backbone, the conformational changes appear. It should
be remarked that the result is obtained up to second order accuracy in the
Taylor expansion. In order to guarantee that the numerical solutions do not
contain a large amount of truncation error, the step sizes $\delta$ and
$\epsilon$ are kept small enough. Nevertheless, this should be a good
approximation to describe visually the mechanism of protein folding.

4. Statistical mechanics 

Now let us discuss the main part of this paper. The statistical properties
of a system with a particular lagrangian can be investigated through its
partition function. It should be emphasized that the statistical observables
hold only at equilibrium, which is fortunately guaranteed in the present
case since the lagrangian under consideration is just the well known
Klein-Gordon scalar lagrangian.

The statistical observables can be conveniently calculated from the
generating functional by the perturbation method [13]. The generating
functional for scalar fields is written as,

$$Z = \int \mathcal{D}\phi\,\mathcal{D}\psi\, \exp\left\{ i\int d^2x\, \mathcal{L}_{\rm tot}(\phi,\psi) \right\}. \tag{16}$$

The partition function can further be obtained from the generating
functional by implementing a Wick rotation of the real axis [14], i.e. by
defining the imaginary time $it = \tau$. Considering the finite time, the
integral is performed over the range $-\beta/2 \to \beta/2$ with a
periodicity condition on the field, that is
$\phi(0,-\beta/2) = \phi(L,\beta/2)$. Here, $L$ is a fixed boundary of the
one dimensional space of the protein backbone, while $\beta = 1/T$ with $T$
the absolute temperature in Kelvin. This specifically leads to the finite
temperature case in Euclidean coordinates,

$$Z = \int \mathcal{D}\phi\,\mathcal{D}\psi\, \exp\left\{ \int_{-\beta/2}^{\beta/2} d\tau \int dx\, \mathcal{L}_{\rm tot}(\phi,\psi) \right\}. \tag{17}$$
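
The relation beta = 1/T holds in natural units. When energies are quoted in
eV, a temperature in kelvin has to be multiplied by the Boltzmann constant
before inverting. A minimal sketch of that conversion, assuming eV as the
energy unit; the function name is hypothetical.

```python
K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K (CODATA 2018)


def beta_from_kelvin(temperature_k: float) -> float:
    """Inverse temperature in 1/eV for a temperature given in kelvin."""
    if temperature_k <= 0.0:
        raise ValueError("temperature must be positive")
    return 1.0 / (K_B_EV_PER_K * temperature_k)


print(beta_from_kelvin(300.0))  # roughly 38.7 per eV
```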

Following the standard prescription in field theory, let us consider the
vacuum transition amplitude in the presence of sources $J(x)$. In this
approach, the interactions can be represented by linear forms of the sources
in terms of the free particle lagrangian $\mathcal{L}_0$,

$$Z_0[J_\phi(x), J_\psi(x)] = \int \mathcal{D}\phi\,\mathcal{D}\psi\, \exp\left\{ \int d^2x \left[ \mathcal{L}_0(\phi,\psi) + J_\phi(x)\phi(x) + J_\psi(x)\psi(x) \right] \right\}, \tag{18}$$

where lo{(j),tlj) = \d^(l)d^(l) + + ^d^ipd^ip and / (Px = drdx. 

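Thereafter, the desired interactions are derived by taking derivatives of
$Z_0$ up to a certain power with respect to the $J$'s at the zero points.

For instance, the $\psi^4$ term of the lagrangian can be derived through the
4th derivative of $Z_0$ in Eq. (18) with respect to $J_\psi$ at
$J_{\phi,\psi} = 0$,

$$\left.\frac{\delta^4 Z_0}{\delta J_\psi^4(x)}\right|_{J_{\phi,\psi}=0} = Z_0\,\psi^4(x). \tag{19}$$

One can perform the same procedure to obtain the other interaction terms,

$$\left.\frac{\delta^4 Z_0}{\delta J_\phi^2(x)\,\delta J_\psi^2(x)}\right|_{J_{\phi,\psi}=0} = Z_0\,\phi^2(x)\,\psi^2(x), \tag{20}$$

and so forth. This means one can represent the interaction terms in terms of
differential functional operators, which then simplifies the complete
generating functional to,

$$Z = \exp\left\{ \int d^2x \left[ -\frac{\lambda}{4}\frac{\delta^4}{\delta J_\psi^4(x)} + \Lambda\frac{\delta^4}{\delta J_\phi^2(x)\,\delta J_\psi^2(x)} \right] \right\} Z_0[J_\phi, J_\psi]. \tag{21}$$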



4.1. Partition function

The integral in Eq. (17) can be evaluated analytically using the Gaussian
integral. This can be accomplished by rewriting it in terms of Gaussian
integrals using the Fourier representation of the Green's function, that is
[15],

$$(\partial_\mu\phi)(\partial^\mu\phi) = -\phi\,\Box\,\phi \quad\text{and}\quad (\partial_\mu\psi)(\partial^\mu\psi) = -\psi\,\Box\,\psi, \tag{22}$$

where $\Box$ stands for the D'Alembertian. Substituting this result into
Eq. (18) yields,



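$$Z_0 = \int \mathcal{D}\phi\,\mathcal{D}\psi\, \exp\left\{ \int d^2x \left[ -\frac{1}{2}\phi\left(\Box + m_\phi^2\right)\phi - \frac{1}{2}\psi\,\Box\,\psi + J_\phi\phi + J_\psi\psi \right] \right\}. \tag{23}$$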



4) 



Throughout the paper, for the sake of simplicity each field is decomposed
into its mean value, corresponding to its classical trajectory, and the
quantum fluctuations around the mean value. Therefore, the fields can be expanded
as Ei, 



$$\phi(x) = \bar\phi(x) + \phi'(x), \tag{25}$$
$$\psi(x) = \bar\psi(x) + \psi'(x), \tag{26}$$

where $\bar\phi$ and $\bar\psi$ are the mean fields of the classical path
while $\psi'$ is the dispersion of the solutions. The variation of the
conformational field, $\phi'$, is considered to be much less significant in
the system. This is motivated by the fact that the protein is a classical
matter with an infinitesimal dispersion relative to its mean value, i.e. its
quantum aspect is negligible ($\phi' = 0$). The generating functional $Z_0$
then becomes,



Zo 



V(j)Vip exp < — d X 



(□ + ml) 



(27) 



Using an argument similar to the one leading to Eq. (22), one can apply the
relation $\int \bar\psi\,\Box\,\psi'\,d^2x = \int \psi'\,\Box\,\bar\psi\,d^2x$
to get,



J V(j)ViPexp j (^4> (□ + ml) 



(28) 



The classical path must satisfy the classical EOMs obtained from the
lagrangian,

$$\Box\,\bar\psi(x) = J_\psi(x) \quad\text{and}\quad \left(\Box + m_\phi^2\right)\bar\phi(x) = J_\phi(x), \tag{29}$$

with the solutions,

$$\bar\psi(x) = \int \Delta_\psi(x-y)\,J_\psi(y)\,d^2y \quad\text{and}\quad \bar\phi(x) = \int \Delta_\phi(x-y)\,J_\phi(y)\,d^2y, \tag{30}$$

where $\Delta(x-y)$ is the Feynman propagator. Substituting Eqs. (29) and
(30) into Eq. (28) yields,

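$$Z_0 = \exp\left\{ \frac{1}{2}\int d^2x\,d^2y \left[ J_\phi(x)\Delta_\phi(x-y)J_\phi(y) + J_\psi(x)\Delta_\psi(x-y)J_\psi(y) \right] \right\} \times \int \mathcal{D}\psi'\, \exp\left\{ -\frac{1}{2}\int d^2x\, \psi'\,\Box\,\psi' \right\}. \tag{31}$$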

Under this approximation, only $\psi'$ remains in the path integral and the
result is just a number, namely $N$.

Now the remaining task is to calculate the transition amplitude by
considering the Taylor expansion of Eq. (31),



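$$Z_0 = N\left\{ 1 + \frac{1}{2}\int d^2x\,d^2y \left[ J_\phi(x)\Delta_\phi(x-y)J_\phi(y) + J_\psi(x)\Delta_\psi(x-y)J_\psi(y) \right] + \frac{1}{2!}\left( \frac{1}{2}\int d^2x\,d^2y \left[ J_\phi(x)\Delta_\phi(x-y)J_\phi(y) + J_\psi(x)\Delta_\psi(x-y)J_\psi(y) \right] \right)^2 + \cdots \right\}. \tag{32}$$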

Considering the higher order derivatives to retrieve the interaction terms
as discussed before, the surviving terms are,

$$Z_0 \simeq N\,\frac{1}{2!}\left( \frac{1}{2}\int d^2x\,d^2y \left[ J_\phi(x)\Delta_\phi(x-y)J_\phi(y) + J_\psi(x)\Delta_\psi(x-y)J_\psi(y) \right] \right)^2. \tag{33}$$

This result is substituted into Eq. (21) to get,

Z = Are«— , (34) 



where, 

/ A 5^ 

(37) 

With these notations, one can evaluate the survived term in e^K^, 



4 6J*^{x) 5Jl{x)5Jl{x) 
$= \int d^2x \left\{ -6\lambda\Delta_\psi^2(0) + 8\Lambda\Delta_\phi(0)\Delta_\psi(0) \right\}$, (38)



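since,

$$\frac{\delta K_\psi}{\delta J_\psi(x)} = \int d^2x_2\,\Delta_\psi(x-x_2)J_\psi(x_2) + \int d^2x_1\,J_\psi(x_1)\Delta_\psi(x_1-x), \tag{39}$$
$$\frac{\delta K_\phi}{\delta J_\phi(x)} = \int d^2x_2\,\Delta_\phi(x-x_2)J_\phi(x_2) + \int d^2x_1\,J_\phi(x_1)\Delta_\phi(x_1-x), \tag{40}$$
$$\frac{\delta^2 K_\psi}{\delta J_\psi^2(x)} = 2\Delta_\psi(0), \tag{41}$$
$$\frac{\delta^2 K_\phi}{\delta J_\phi^2(x)} = 2\Delta_\phi(0). \tag{42}$$

Finally, this leads to,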



$$Z = N \exp\left\{ \int_0^\beta d\tau \int_0^L dx \left[ -\frac{3}{4}\lambda\Delta_\psi^2(0) + \Lambda\Delta_\phi(0)\Delta_\psi(0) \right] \right\}. \tag{43}$$

This is the master equation to investigate the statistical properties in the
next subsection.
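
The combinatorial factors behind the master equation can be cross-checked on
a finite-dimensional stand-in. The sketch below is an illustration under
stated assumptions and not the functional calculation itself: the sources
are replaced by two ordinary variables, K is taken as
d_psi J_psi^2 + d_phi J_phi^2 (so that its second derivatives equal
2 d_psi and 2 d_phi), the kept term is (1/2!)(K/2)^2, and the operator
prefactors are assumed to be -lambda/4 and +Lambda. All names are
hypothetical.

```python
from fractions import Fraction
from math import factorial


def derivative_at_zero(poly: dict, order_psi: int, order_phi: int) -> Fraction:
    """Mixed derivative of a polynomial in (J_psi, J_phi), evaluated at J = 0.

    poly maps exponent pairs (a, b) to the coefficient of J_psi**a * J_phi**b.
    Only the monomial whose exponents match the derivative orders survives.
    """
    coeff = poly.get((order_psi, order_phi), Fraction(0))
    return coeff * factorial(order_psi) * factorial(order_phi)


def exponent_density(lam, Lam, d_psi, d_phi) -> Fraction:
    """Apply -lam/4 d^4/dJ_psi^4 + Lam d^4/(dJ_phi^2 dJ_psi^2) to K^2 / 8."""
    lam, Lam = Fraction(lam), Fraction(Lam)
    d_psi, d_phi = Fraction(d_psi), Fraction(d_phi)
    # K^2 / 8 with K = d_psi * J_psi^2 + d_phi * J_phi^2, expanded by hand.
    k_squared_over_8 = {
        (4, 0): d_psi**2 / 8,
        (2, 2): 2 * d_psi * d_phi / 8,
        (0, 4): d_phi**2 / 8,
    }
    quartic = derivative_at_zero(k_squared_over_8, 4, 0)
    mixed = derivative_at_zero(k_squared_over_8, 2, 2)
    return -lam / 4 * quartic + Lam * mixed


# Compare with -(3/4) lam d_psi^2 + Lam d_phi d_psi for arbitrary test values.
lam, Lam, d_psi, d_phi = 5, 7, 2, 3
print(exponent_density(lam, Lam, d_psi, d_phi))
print(Fraction(-3, 4) * lam * d_psi**2 + Lam * d_phi * d_psi)
```

Under these assumptions the two printed values agree, which is the
combination -(3/4) lambda Delta_psi^2(0) + Lambda Delta_phi(0) Delta_psi(0).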

4.2. Statistical observables

Let us consider the specific heat of the system at constant volume, $C_V$,
which is of particular interest from the experimental point of view. The
specific heat can be derived directly from the partition function using the
relation,

$$C_V = \beta^2\,\frac{\partial^2 \ln Z}{\partial\beta^2}. \tag{44}$$

In the present case, it is found to be,

$$C_V = \beta^2\,\frac{\partial^2}{\partial\beta^2}\left\{ \ln N - \beta L \left[ \frac{3}{4}\lambda\Delta_\psi^2(0) - \Lambda\Delta_\phi(0)\Delta_\psi(0) \right] \right\}, \tag{45}$$



after performing the integration over $\tau$ and $x$ respectively, while the
overall factor has been obtained as [13],

$$N = \frac{1}{4\pi\sinh(k\beta/2)}. \tag{46}$$
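
The contribution of the overall factor to the specific heat can be checked
numerically. This is a minimal sketch, assuming the standard relation
C_V = beta^2 d^2(ln Z)/d beta^2 with the Boltzmann constant set to one, and
reading the overall factor as N = 1 / (4 pi sinh(k beta / 2)). With ln Z
replaced by ln N alone, the second derivative has the closed form
(k beta / 2)^2 / sinh^2(k beta / 2). The function names and the step size
`h` are hypothetical choices.

```python
import math


def log_n(beta: float, k: float) -> float:
    """ln N for N = 1 / (4 pi sinh(k beta / 2))."""
    return -math.log(4.0 * math.pi * math.sinh(0.5 * k * beta))


def heat_capacity(log_z, beta: float, h: float = 1e-3) -> float:
    """C_V = beta^2 d^2(ln Z)/d beta^2 via a central second difference."""
    second = (log_z(beta + h) - 2.0 * log_z(beta) + log_z(beta - h)) / h**2
    return beta**2 * second


def heat_capacity_free(beta: float, k: float) -> float:
    """Closed form for ln Z = ln N: (k beta / 2)^2 / sinh^2(k beta / 2)."""
    a = 0.5 * k * beta
    return (a / math.sinh(a)) ** 2


beta, k = 2.0, 1.5
print(heat_capacity(lambda b: log_n(b, k), beta))
print(heat_capacity_free(beta, k))
```

The same `heat_capacity` helper accepts any callable for ln Z, so the
interaction terms can be added once the propagators at the origin are known
as functions of beta.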

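Next, one must find the form of $\Delta(0)$. This can be achieved by solving
for the Green function. Since the Feynman propagators $\Delta(x,\tau)$ obey,

$$\left[ -\frac{\partial^2}{\partial\tau^2} - \frac{\partial^2}{\partial x^2} \right]\Delta_\psi(x,\tau) = \delta(x)\delta(\tau), \qquad \left[ -\frac{\partial^2}{\partial\tau^2} - \frac{\partial^2}{\partial x^2} + m_\phi^2 \right]\Delta_\phi(x,\tau) = \delta(x)\delta(\tau), \tag{47}$$

and taking a particular form of the Green functions [13],

$$\Delta_\psi(x,\tau) = \int\frac{dk}{2\pi}\,e^{ikx}\,\Delta_\psi(\tau), \qquad \Delta_\phi(x,\tau) = \int\frac{dq}{2\pi}\,e^{iqx}\,\Delta_\phi(\tau), \tag{48}$$

the imaginary-time propagators $\Delta(\tau)$ should satisfy the following
differential equations,

$$\left[ -\frac{d^2}{d\tau^2} + k^2 \right]\Delta_\psi(\tau) = \delta(\tau), \qquad \left[ -\frac{d^2}{d\tau^2} + q^2 + m_\phi^2 \right]\Delta_\phi(\tau) = \delta(\tau). \tag{49}$$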



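Imposing the Dirichlet periodic boundary conditions,

$$\Delta\!\left(\frac{\beta}{2}\right) = \Delta\!\left(-\frac{\beta}{2}\right), \tag{50}$$

the solutions of Eq. (49) are [13],

$$\Delta_\psi(\tau) = \frac{\cosh\left(k\left(\beta/2 - |\tau|\right)\right)}{2k\sinh(k\beta/2)}, \tag{51}$$
$$\Delta_\phi(\tau) = \frac{\cosh\left(\sqrt{q^2+m_\phi^2}\,\left(\beta/2 - |\tau|\right)\right)}{2\sqrt{q^2+m_\phi^2}\,\sinh\left(\beta\sqrt{q^2+m_\phi^2}/2\right)}. \tag{52}$$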
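
The closed form of the imaginary-time propagator can be compared against its
frequency-sum representation. This is a minimal sketch, assuming the
propagator reads cosh(k (beta/2 - |tau|)) / (2 k sinh(k beta / 2)) and using
the standard identity
(1/beta) sum_n cos(w_n tau) / (w_n^2 + k^2) with w_n = 2 pi n / beta.
The function names and the truncation `n_max` are hypothetical choices.

```python
import math


def propagator_tau(tau: float, beta: float, k: float) -> float:
    """cosh(k (beta/2 - |tau|)) / (2 k sinh(k beta / 2)) for |tau| <= beta/2."""
    return math.cosh(k * (0.5 * beta - abs(tau))) / (2.0 * k * math.sinh(0.5 * k * beta))


def matsubara_sum(tau: float, beta: float, k: float, n_max: int = 50_000) -> float:
    """(1/beta) sum over |n| <= n_max of cos(w_n tau) / (w_n^2 + k^2)."""
    total = 1.0 / k**2  # n = 0 term
    for n in range(1, n_max + 1):
        w = 2.0 * math.pi * n / beta
        # Terms for +n and -n are equal, hence the factor of two.
        total += 2.0 * math.cos(w * tau) / (w * w + k * k)
    return total / beta


beta, k = 2.0, 1.5
for tau in (0.0, 0.3, 1.0):
    print(tau, propagator_tau(tau, beta, k), matsubara_sum(tau, beta, k))
```

The truncated sum converges slowly at tau = 0, where the neglected tail is
of order beta / (2 pi^2 n_max), so agreement there is limited by `n_max`.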
Therefore the Fourier representation of the Green functions can be written
as,

$$\Delta_\psi(x,\tau) = \int\frac{dk}{2\pi}\,\frac{e^{ikx}\cosh\left(k\left(\beta/2-|\tau|\right)\right)}{2k\sinh(k\beta/2)}, \tag{53}$$
$$\Delta_\phi(x,\tau) = \int\frac{dq}{2\pi}\,\frac{e^{iqx}\cosh\left(\sqrt{q^2+m_\phi^2}\,\left(\beta/2-|\tau|\right)\right)}{2\sqrt{q^2+m_\phi^2}\,\sinh\left(\beta\sqrt{q^2+m_\phi^2}/2\right)}. \tag{54}$$

As can be seen, the propagator $\Delta_\phi(x,\tau)$ in Eq. (54) contains a
factor of $\sqrt{q^2+m_\phi^2}$, which prevents the integral from being
performed analytically. In this paper, it is done numerically.
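
One way to carry out such an integral numerically is sketched below. It is
an illustration under stated assumptions, not the procedure used for the
figures: the propagator is evaluated at x = 0 and tau = 0, where the
integrand reduces to coth(beta w / 2) / (2 w) / (2 pi) with
w = sqrt(q^2 + m^2), and the momentum range is cut off at a hypothetical
`q_max`, since the integrand only falls off like 1/q. Composite Simpson
quadrature is used, and all names are hypothetical.

```python
import math


def integrand(q: float, beta: float, mass: float) -> float:
    """coth(beta w / 2) / (2 w) / (2 pi) with w = sqrt(q^2 + mass^2)."""
    w = math.sqrt(q * q + mass * mass)
    return 1.0 / (math.tanh(0.5 * beta * w) * 2.0 * w * 2.0 * math.pi)


def delta_phi_origin(beta: float, mass: float, q_max: float, steps: int = 20_000) -> float:
    """Composite Simpson estimate of the integral over -q_max <= q <= q_max."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = q_max / steps
    # The integrand is even in q, so integrate over [0, q_max] and double.
    total = integrand(0.0, beta, mass) + integrand(q_max, beta, mass)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h, beta, mass)
    return 2.0 * total * h / 3.0


# Cold limit check: coth -> 1 and the integral tends to asinh(q_max / m) / (2 pi).
print(delta_phi_origin(beta=200.0, mass=1.0, q_max=10.0))
print(math.asinh(10.0) / (2.0 * math.pi))
```

The result depends logarithmically on `q_max`, so any comparison between
temperatures should be made at a fixed cutoff.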

Performing the integrals in Eqs. (53) and (54), and substituting the
results into Eq. (45), one then obtains the numerical results as shown in
Figs. 2 and 3 for various values of $\lambda$ and $\Lambda$, while
$k = 0.01$. The values of the other variables are the same as in Fig. 1.



5. Conclusion 

An extension of the phenomenological model describing the conformational
dynamics of proteins has been briefly reintroduced. The model is based on
the matter interactions between the conformational field and the nonlinear
sources, represented as the scalar bosonic fields $\phi$ and $\psi$. As
already shown in our previous work [3], the nonlinear and tension force
terms appear naturally from the scalar lagrangian with the $\psi^4$
self-interaction. Moreover, such forces are realized as a consequence of
symmetry breaking.

In the present paper, the numerical simulation of protein folding dynamics
has been refined to correct some technical errors in the previously reported
result [3]. However, the figure has only changed slightly, while the
conclusion remains the same.

Moreover, in the present paper the statistical properties of protein folding
within the model are studied in detail. In particular, the specific heat,
$C_V$, has been calculated analytically using statistical mechanics and the
path integral method. The evolution of $C_V$ as a function of temperature
has been shown for various levels of nonlinearity and interaction with the
nonlinear source, represented by $\lambda$ and $\Lambda$. It is found that
the two contribute in opposite ways, and could completely cancel each other
at certain values as Eq. (6) is fulfilled. This occurs when the symmetry is
maximally broken. This also means that increasing energy absorption prefers
a high level of nonlinearity of the sources and at the same time a weak
interaction between the sources and the protein backbone.

Figure 1: The soliton propagations and conformational changes on the protein
backbone inducing protein folding. The vertical axis in the soliton
evolution denotes time in seconds, while the horizontal axis denotes its
amplitude. The conformational changes are on the (x, y, z) plane.

Figure 2: $C_V$ as a function of temperature for various values of
A = 0.0003 (solid line), 0.0012 (dashed line), 0.0021 (dash-dotted line),
0.0030 (dotted line) with A = 0.00283.

Figure 3: $C_V$ as a function of temperature for A ~ 0.00283 (blue line) and
(red line) with A = 0.0003.